Automatic Bug Triage using Semi-Supervised Text Classification

نویسندگان

  • Jifeng Xuan
  • He Jiang
  • Zhilei Ren
  • Jun Yan
  • Zhongxuan Luo
چکیده

In this paper, we propose a semi-supervised text classification approach for bug triage to avoid the deficiency of labeled bug reports in existing supervised approaches. This new approach combines naive Bayes classifier and expectationmaximization to take advantage of both labeled and unlabeled bug reports. This approach trains a classifier with a fraction of labeled bug reports. Then the approach iteratively labels numerous unlabeled bug reports and trains a new classifier with labels of all the bug reports. We also employ a weighted recommendation list to boost the performance by imposing the weights of multiple developers in training the classifier. Experimental results on bug reports of Eclipse show that our new approach outperforms existing supervised approaches in terms of classification accuracy. Keywordsautomatic bug triage; expectation-maximization; semi-supervised text classification; weighted recommendation list

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic bug triage using text categorization

Bug triage, deciding what to do with an incoming bug report, is taking up increasing amount of developer resources in large open-source projects. In this paper, we propose to apply machine learning techniques to assist in bug triage by using text categorization to predict the developer that should work on the bug based on the bug’s description. We demonstrate our approach on a collection of 15,...

متن کامل

Analysis of Bug Triage using Data Preprocessing (Reduction) Techniques

In the bug triage we have an unavoidable step of fixing the bugs which helps in correctly assigning a developer to a new bug. Text classification and binary classification techniques are applied to decrease the time cost in manual work and to enhance the working of automatic bug triage. We address the problem of data reduction and hence we combine the instance selection and the feature selectio...

متن کامل

Analysis of Bug Triage using Data Preprocessing (Reduction) Techniques

In the bug triage we have an unavoidable step of fixing the bugs which helps in correctly assigning a developer to a new bug. Text classification and binary classification techniques are applied to decrease the time cost in manual work and to enhance the working of automatic bug triage. We address the problem of data reduction and hence we combine the instance selection and the feature selectio...

متن کامل

A Novel Multi label Text Classification Model using Semi supervised learning

Automatic text categorization (ATC) is a prominent research area within Information retrieval. Through this paper a classification model for ATC in multi-label domain is discussed. We are proposing a new multi label text classification model for assigning more relevant set of categories to every input text document. Our model is greatly influenced by graph based framework and Semi supervised le...

متن کامل

Automatic Text Classification: A Technical Review

Automatic Text Classification is a semi-supervised machine learning task that automatically assigns a given document to a set of pre-defined categories based on its textual content and extracted features. Automatic Text Classification has important applications in content management, contextual search, opinion mining, product review analysis, spam filtering and text sentiment mining. This paper...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010